Abstract

Random hashing is a standard method to balance loads among nodes in P2P networks. However,
hashing destroys the locality properties of object keys, which are critical to many applications, in
particular those that require range searching. To preserve key order while keeping loads balanced,
Ganesan, Bawa and Garcia-Molina proposed a load-balancing algorithm that supports both object
insertion and deletion and guarantees a ratio of 4.237 between the maximum and minimum loads
among nodes in the network with constant amortized costs. However, their algorithm is not
straightforward to implement in real networks because it is recursive. Their algorithm mostly uses local
operations with global max-min load information. In this work, we present a simple non-recursive
algorithm using essentially the same primitive operations as in Ganesan et al.'s work. We prove that
for insertion and deletion, our algorithm guarantees a constant max-min load ratio of 6.373 with
constant amortized costs.

1 Introduction 

One of the important issues in peer-to-peer (P2P) networks is load balancing, i.e., keeping the loads
of the nodes in the network balanced. In P2P networks, the standard technique for spreading keys over
nodes is hashing, and research on the construction of distributed hash tables (DHTs) has recently been
very active. However, hashing destroys the locality properties of keys, which makes it difficult for many
applications to support range searching.

Ganesan, Bawa and Garcia-Molina [4] proposed a sophisticated load-balancing algorithm on top
of linearly ordered buckets. Since the ordering of keys is preserved, searching for an item is simple and
range searching is naturally supported. Their algorithm, called the AdjustLoad algorithm, uses
two basic balancing operations, NbrAdjust and Reorder, together with global max-min information, and
can maintain a ratio of 4.237 between the maximum and minimum loads among nodes in the network,
while requiring constant amortized work per operation. We describe their algorithm in
Section 3.3.

Although both operations are easy to state, the AdjustLoad algorithm is recursive; thus, it is not
straightforward to implement in distributed environments. Also, in the worst case, the AdjustLoad
algorithm does not bound the number of invoked balancing operations, the amount of updated data
(partition changes and load information) or the number of affected nodes. Since each balancing operation
may require global information, the number of global queries may be higher than the number of insertions
and deletions.

In this paper, we present a simpler, non-recursive load-balancing algorithm that uses the same
primitive operations, NbrAdjust and Reorder, as in Ganesan et al., but for each insertion or deletion,
primitive operations are called at most once. This also implies that our proposed algorithm makes
a query for global information at most once per insertion or deletion.

As in Ganesan et al., we prove that the ratio between the maximum load and the minimum load, the
imbalance ratio, is bounded by a constant (at most 6.373 for our algorithm) and that the amortized cost
of the algorithm is a constant per operation.

Our algorithm uses two high-level operations, MinBalance and Split: 

1. The MinBalance operation occurs when an insertion at some node u causes the load
of u to be too high; in this case, we take the node v with the minimum load, transfer v's load
to the lighter-loaded of v's neighbors, and let v take over half of the load of u.

2. The Split operation occurs when a deletion at some node u causes the load of u to be
too low; in this case, we either let u take some load from one of its neighbors, or transfer all of its
load to its neighbors and let it take over half of the load of the maximum-loaded node.

Overview of the techniques. When considering only insertions, the key to proving the imbalance
ratio is to analyze how the load of the neighbors of the minimum-loaded node changes over time. Let z
denote the lightly-loaded neighbor of v. The bad situation can occur when z takes the entire loads of the
minimum-loaded nodes (at various points in time) too many times; this could cause the load of z to
be too high compared with the minimum load. We show that this is not possible because each time a
load is transferred to z, the minimum load is increased to within some constant factor of the load of z,
thus keeping the imbalance ratio bounded.

We make an important assumption in our analysis of the insertion-only case, namely, that the min- 
imum load can never decrease. For the general case, this is not true. We maintain the general analysis 
framework by introducing the notion of phases such that within a phase the minimum load remains 
monotonically non-decreasing. To prove the result in this case, we show that inside a phase, the imbal- 
ance ratio is small; and when we enter a new phase (i.e., when the minimum load decreases), we are in 
a good starting condition. 

Organization. The remainder of the paper is organized as follows. In Section 2, we discuss related
work. Section 3 describes the model, states the basic definitions and reviews the AdjustLoad algorithm.
In Section 4, we consider the insert-only case. We propose the MinBalance operation, analyze the
imbalance ratio and calculate the cost of the algorithm. In Section 5, we consider the general case, which
has both insertion and deletion. We propose the Split operation, which is used when a deletion occurs,
and analyze the imbalance ratio and the cost of the load-balancing algorithm.



2 Related work 

Complex queries in P2P networks have long been an interesting research problem. The line of work
started when Harren et al. [5] argued that complex queries are an important open issue in P2P networks.
Since then, there have been many research efforts on search methods in P2P networks. For more
information, readers are referred to the survey on searching in P2P networks by Risson and Moors [15].

Range searching is one of the search methods that arises in many fields including P2P systems. Since 
most data structures for distributed items are based on distributed hash tables (DHTs) [IIIillllJllEil, 



early work focused mainly on building range search data structures on top of DHTs, e.g., PHT [13] and
DST [21], based on binary search trees and segment trees, respectively. These data structures do not
address a load-balancing mechanism when supporting insertion and deletion. Moreover, hot spots may
occur when loads are highly imbalanced.

There are other data structures for range searching in P2P networks, which are not based on DHTs. 
SkipNet Q, which adapts from Skip Lists [l2^ . can support range searching or load balancing but not 
both. Skip Graphs [1] is also adapted from Skip Lists and addresses range searching, but it does not
address load balancing of the number of items per node.

Many data structures support efficient range queries and show a good load balancing property in 
experiments, e.g., Mercury 0], Baton 0], Chordal graph j^j, Dak [l8[ and Yarqs However, they do 
not have any theoretical guarantees on load distribution among nodes in their data structures. 

There are mainly two groups of researchers addressing both range searching and load balancing
theoretically. The first is the group of Karger and Ruhl [9, 10]. The second is the group of Ganesan
and Bawa [3] and Ganesan, Bawa and Garcia-Molina [4]. Both groups use two similar operations. The
first operation balances loads between two consecutive nodes by transferring load from the more heavily
loaded node to the more lightly loaded one. In the second operation, a node i transfers its entire load to
its neighbor and relocates to a new position, where it shares the load of a node j.

Karger and Ruhl [9, 10] presented a randomized protocol in which each node chooses another node
at random to perform a balancing operation with. Load-balancing operations must be performed regularly,
even when there is no insertion or deletion at the node. More precisely, they showed that if each node
contacts $\Omega(\log n)$ other nodes, then, with high probability, the load of each node is between
$\frac{\epsilon}{16} \cdot L$ and $\frac{16}{\epsilon} \cdot L$, where $L$ denotes the average load and
$\epsilon$ is a constant with $0 < \epsilon < 1/4$; the hidden constant in the $\Omega$ notation depends
on $\epsilon$. They also showed that the cost of the load-balancing steps can be amortized
over the constant costs of insertion and deletion. Again, the constants depend on the value of $\epsilon$. We
note that this implies a high-probability bound of at least 4,096 for the imbalance ratio.
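
The figure of 4,096 can be checked with a short calculation. The worked step below assumes that the per-node load bounds are $\frac{\epsilon}{16} \cdot L$ and $\frac{16}{\epsilon} \cdot L$ with $0 < \epsilon < 1/4$; these constants are our reading of the bounds quoted above, chosen because they reproduce the stated value.

```latex
\[
  \frac{\max_u L(u)}{\min_u L(u)}
  \;\le\; \frac{(16/\epsilon)\cdot L}{(\epsilon/16)\cdot L}
  \;=\; \frac{256}{\epsilon^{2}}
  \;>\; \frac{256}{(1/4)^{2}}
  \;=\; 4096
  \qquad \text{for } 0 < \epsilon < \tfrac{1}{4}.
\]
```

The guaranteed ratio $256/\epsilon^2$ therefore cannot be brought down to 4,096 for any admissible $\epsilon$.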

Ganesan and Bawa [3] and Ganesan, Bawa and Garcia-Molina [4] proposed a distributed
load-balancing algorithm that works on top of any linear data structure of items. Their algorithm is recursive
and uses the information of the maximum-loaded and minimum-loaded nodes. They guarantee a constant
imbalance ratio with a constant cost per insertion and deletion. The ratio can be adjusted to 4.237, which
is much smaller than that of Karger and Ruhl. The major drawback is that their algorithm requires
global knowledge of the maximum-loaded and minimum-loaded nodes. While this issue is important in
practice (e.g., when building real P2P systems), the cost of finding the global information can be amortized
over the cost of other operations, e.g., node searching. For completeness, we discuss how to obtain this
global information in Section 5.

3 Problem setup, cost model and the algorithm of Ganesan et 
al. 

In this section, we describe the problem setup, discuss the cost of an algorithm and review the algorithm 
of Ganesan et al, the AdjustLoad algorithm. 

3.1 Problem setup 

We follow closely the basic setup of [4]. The system consists of $n$ nodes and maintains a collection
of keys. Let $V$ be the set of all nodes. The key space is partitioned into $n$ ranges, with boundaries
$R_0 < R_1 < \cdots < R_n$. Let $N_i$ be the $i$-th node, which manages the range $[R_{i-1}, R_i)$. For any
node $u \in V$, let $L(u)$ be the number of keys stored in $u$. At any point of time, there is an ordering of
the nodes. This ordering defines left and right relations among nodes, and these relations are crucial to
our analysis.

As in previous work, in some operations, a node requires non-local information, namely the values
of the maximum and minimum loads and the locations of the nodes with the maximum and minimum
loads.

When a key is inserted or deleted, the node that manages the range containing the key must update
its data. After that, the load-balancing algorithm is invoked. Our goal is to maintain the ratio of the
maximum load to the minimum load, called the imbalance ratio. We say that a load-balancing algorithm
guarantees an imbalance ratio $\sigma$ if after each insertion or deletion of a key and the execution of the
algorithm, $\max_u L(u) \le \sigma \cdot \min_u L(u) + c_0$ for some constant $c_0$.
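
The definition above can be turned into a direct check on a snapshot of node loads. The following is a minimal sketch, not part of the original algorithm: the function name `guarantees_ratio`, the parameter names `sigma` and `c0`, and the example loads are all chosen here for illustration.

```python
def guarantees_ratio(loads, sigma, c0=0):
    """Check max_u L(u) <= sigma * min_u L(u) + c0 for a list of node loads."""
    if not loads:
        raise ValueError("at least one node is required")
    return max(loads) <= sigma * min(loads) + c0


# Hypothetical loads L(u) of four nodes, listed in key order.
loads = [4, 9, 6, 5]
print(guarantees_ratio(loads, sigma=2.0, c0=1))  # 9 <= 2*4 + 1 holds
print(guarantees_ratio(loads, sigma=2.0))        # 9 <= 2*4 does not hold
```

The additive constant matters only for small loads; as the minimum load grows, the check is dominated by the multiplicative ratio.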

We assume that, initially, each node has a small constant load, $c_0$. As in [4], we ignore concurrency
issues and consider only a serial schedule of operations.

3.2 The cost 

To analyze the cost of a load-balancing algorithm, we follow the three types of costs of Ganesan et al.

1. Data Movement. Each operation that moves a key from one node to another is counted as a
unit cost.

2. Partition Change. When load-balancing steps are performed, the range of keys stored in each
node may change. This change has to be propagated through the system so that the next insertion 
or deletion goes to the right node. 

3. Load Information. This work occurs when a node requests non-local information, e.g., requesting 
for the node with the minimum or maximum load. 

In this paper, we analyze the amortized cost of the load-balancing algorithm, i.e., we consider the 
worst-case cost of a sequence of m load-balancing steps instead of a single one. 

As in Ganesan et al., we first analyze a simpler model that accounts only for the data movement cost.
For the partition change cost and the load information cost, we assume that there is a centralized server
which maintains the partition boundaries of each node and can answer requests for global information.

This assumption can be removed as shown in Ganesan et al. For completeness, we discuss this in
Section 5.

3.3 The algorithm of Ganesan et al. 

We briefly review the algorithm introduced in [4], the AdjustLoad algorithm. The algorithm uses two
basic operations, NbrAdjust and Reorder, defined as follows.

NbrAdjust: Node $N_i$ transfers its load to its neighbor $N_{i+1}$. This may change the boundary $R_i$
between $N_i$ and $N_{i+1}$.

Reorder: Consider a node $N_i$ with an empty range $[R_i, R_i)$. $N_i$ relocates its position and splits
the range of $N_j$. Then, the range $[R_j, X)$ is managed by $N_j$, whereas the range $[X, R_{j+1})$ is managed
by $N_i$, for some $X$ with $R_j < X < R_{j+1}$. Finally, rename nodes appropriately.
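
The two primitives can be sketched on a toy model in which each node is a sorted Python list of keys and the nodes are kept in key order. This is an illustrative sketch only: the names `nbr_adjust` and `reorder`, the `count` parameter, and the choice to split the donor's keys in the middle are assumptions made here, since the definitions above leave the amount of transferred load and the split point $X$ open.

```python
def nbr_adjust(nodes, i, count):
    """NbrAdjust sketch: move the `count` largest keys of node i to node i+1."""
    if not 0 <= i < len(nodes) - 1:
        raise IndexError("node i needs a right neighbour")
    if not 0 <= count <= len(nodes[i]):
        raise ValueError("count must be between 0 and the load of node i")
    split = len(nodes[i]) - count
    moved = nodes[i][split:]
    del nodes[i][split:]
    nodes[i + 1][:0] = moved  # prepend, so keys stay globally sorted


def reorder(nodes, i, j):
    """Reorder sketch: the empty node i relocates next to node j and takes
    the upper part of j's keys (split point chosen in the middle here)."""
    if i == j or nodes[i]:
        raise ValueError("node i must be empty and different from node j")
    donor = nodes[j]
    mid = (len(donor) + 1) // 2      # donor keeps the lower ceil(half)
    upper = donor[mid:]
    del donor[mid:]
    relocated = nodes.pop(i)         # positions after i shift left by one
    relocated.extend(upper)
    j_new = j - 1 if i < j else j
    nodes.insert(j_new + 1, relocated)  # "rename nodes appropriately"


# Hypothetical example with three nodes.
nodes = [[1, 2], [3], [10, 11, 12, 13, 14, 15]]
nbr_adjust(nodes, 1, 1)  # node 1 hands its only key to node 2 and becomes empty
reorder(nodes, 1, 2)     # the empty node takes the upper part of the heavy node
print(nodes)             # [[1, 2], [3, 10, 11, 12], [13, 14, 15]]
```

In both operations the concatenation of all node lists stays sorted, which is the property that keeps range searching simple.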

For some constants $c$ and $\delta$, they define a sequence of thresholds $T_m = \lfloor c\delta^m \rfloor$, for
all $m \ge 1$, used to trigger the AdjustLoad procedure described later. When $\delta = 2$, they call their
algorithm the Doubling Algorithm. The AdjustLoad procedure works as well when
$\delta \ge \phi = \frac{1+\sqrt{5}}{2} \approx 1.618$, the golden ratio. They call their algorithm that operates
at that ratio the Fibbing Algorithm. They prove that the AdjustLoad procedure running at that ratio
guarantees an imbalance ratio $\sigma$ of $\delta^3 \approx 4.237$.
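
To make the threshold sequence concrete, the sketch below computes the first few thresholds for the doubling and golden-ratio settings and evaluates the cube of the golden ratio. It assumes the threshold formula is the floor of $c\delta^m$; the function name `thresholds` and the choice $c = 1$ are illustrative. Note that the cube of the golden ratio equals $2 + \sqrt{5}$, about 4.236, which the text quotes rounded up as 4.237.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio, about 1.618


def thresholds(c, delta, count):
    """Return [T_1, ..., T_count] with T_m = floor(c * delta**m)."""
    return [math.floor(c * delta ** m) for m in range(1, count + 1)]


print(thresholds(1, 2, 5))    # doubling:     [2, 4, 8, 16, 32]
print(thresholds(1, PHI, 5))  # golden ratio: [1, 2, 4, 6, 11]
print(PHI ** 3)               # 2 + sqrt(5), about 4.236
```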

Given the threshold sequence, the AdjustLoad procedure is as follows. When the load of a node $N_i$
crosses a threshold $T_m$, the load-balancing algorithm is invoked on that node. Let $N_j$ be the
lightly-loaded neighbor of $N_i$. If the load of $N_j$ is not too high, the NbrAdjust operation is applied,
followed by two recursive calls of AdjustLoad on $N_i$ and $N_j$. Otherwise, it checks the load of the
minimum-loaded node $N_k$, and if the imbalance ratio is too high, tries to balance the loads with the
Reorder operation as follows. First, $N_k$ transfers its load to its lightly-loaded neighbor $N_{k \pm 1}$.
Then, the Reorder operation is invoked on $N_k$ and $N_i$, and finally another call to the AdjustLoad
procedure is invoked on node $N_{k \pm 1}$.

4 The algorithm for the insert-only case 

In this section, we present the algorithm for the insert-only case. This case is simpler to analyze and
provides general ideas on how to deal with the general case. It is also of practical interest because in
many applications, such as file sharing, deletions rarely occur.

We present the MinBalance procedure, which uses the same primitive operations, NbrAdjust
and Reorder. However, it is simpler than the AdjustLoad algorithm of Ganesan et al.
Notably, the MinBalance procedure is not recursive and performs only a constant number of primitive
operations. We prove a bound of $3 + \sqrt{5} \approx 5.237$ on the imbalance ratio and show that the
amortized cost of an insertion is a constant.

4.1 The MinBalance operation 

After an insertion occurs on any node $u$, the MinBalance operation is invoked. If the load of $u$ is
more than $\alpha$ times the load of the current minimum-loaded node $v$, node $v$ transfers its entire load
to whichever of its neighbors has the lighter load. After that, $v$ takes over half of the load of $u$. We call
these steps the MinBalance steps. Note that this procedure requires the information of the minimum load;
the system must maintain this information.

Procedure MinBalance(u)
    Let v be the minimum-loaded node in the system.
    if $L(u) > \alpha \cdot L(v)$ then
        // MinBalance steps
        Let z be the lightly-loaded neighbor of v.
        Transfer all keys of v to z.
        Transfer a half-load of u to v, s.t. $L(u) = \lceil L(u)/2 \rceil$ and $L(v) = \lfloor L(u)/2 \rfloor$.
        {Rename nodes appropriately.}
    end if


4.2 Analysis of the imbalance ratio for the insert-only case 

We use the notion of time in the system in the natural way. For simplicity, we assume that each
operation completes instantly.

For any time $t$, let $L_t(u)$ and $L'_t(u)$ denote the load of node $u$ after time $t$ and right before time $t$,
i.e., at time $t - \epsilon$ for some $\epsilon > 0$, respectively. Let $\mathrm{Min}_t$ and $\mathrm{Min}'_t$ denote the
minimum load in the system after time $t$ and right before time $t$, respectively, i.e.,
$\mathrm{Min}_t = \min_{u \in V} L_t(u)$ and $\mathrm{Min}'_t = \min_{u \in V} L'_t(u)$.

We shall prove that the following invariant holds at all times.

For any time $t$, the load of any node is not over $(\alpha + 2)$ times the minimum load.

Note that the invariant bounds the ratio of the maximum load to the minimum load by $\alpha + 2$, i.e.,
$\mathrm{Max}_t / \mathrm{Min}_t \le \alpha + 2$, where $\mathrm{Max}_t$ is the maximum load at time $t$. We prove
this invariant in Theorem 1.

4.2.1 Overview of the analysis 

While the analysis is rather involved, the idea is not very difficult to understand. This section gives a
short overview of the analysis.

The main idea of the analysis is to prove that the imbalance ratio remains under a constant after any 
operation. We assume that the system starts with a uniform load distribution. We first show the key 
property for the insert-only case, that is, the minimum load never decreases. Then, we analyze how the 
insertions change the loads of the nodes. 

When an insertion occurs on any node u, there are two cases depending on if u calls MinBalance 
steps or not. After insertion, the bound on the load of u itself can be verified easily. 

However, the insertion also affects two other nodes, i.e., the minimum-loaded node v and its lighter
neighbor z. When MinBalance steps are invoked, the minimum-loaded node v transfers its entire load
to z and takes over half of the load of u. While showing the bound on the load of v is straightforward, the
bound on z requires more analysis because z may receive loads many times throughout the execution
of the algorithm. The harder case for showing the bound on z is when z repeatedly takes loads from
its neighbors without any insertions to z. However, in that case, we know that each of these neighbors
of z was the minimum-loaded node at some point; thus, we can establish the bound on the load of z
relative to the minimum load and prove the required ratio.

4.2.2 The analysis of the imbalance ratio 

First, we show an important property of the minimum load in the system, i.e., after any operation, the
minimum load never decreases.

Lemma 1 Suppose $\alpha \ge 2$. Then the minimum load of the system never decreases.

Proof: Inserting keys into the system only increases loads. The only steps that decrease loads are
the MinBalance steps; therefore, we consider only these steps. For each execution of the MinBalance
steps, there are three nodes involved, i.e., node $u$, which initiates the MinBalance steps at some time $t$,
the current minimum-loaded node $v$, and node $z$, which is the lightly-loaded neighbor of $v$ at time $t$.
First, $v$ transfers all of its keys to $z$; hence $z$'s load increases. The new load of each of $u$ and $v$ is half
of $u$'s current load. Since $u$ invokes the MinBalance steps, its load must be more than
$\alpha \cdot \mathrm{Min}_t \ge 2 \cdot \mathrm{Min}_t$. Thus, after the steps, $v$'s load and $u$'s load are at least
$\mathrm{Min}_t$, as required. ∎

Figure 1: (a) The left-transfer event on z. (b) The right-transfer event on z.

After an insertion occurs on any node $u$, the load of $u$ changes. The next lemma guarantees a good
ratio on node $u$.

Lemma 2 Consider an insertion occurring on node $u$ at some time $t$. Suppose $\alpha \ge 3$. After the
insertion and its corresponding load-balancing steps, $L_t(u) \le \alpha \cdot \mathrm{Min}_t$.

Proof: After an insertion, we consider two cases. The first case is when the MinBalance steps are
invoked, after which the load of $u$ decreases by half. Note that the load of $u$ after the insertion cannot be
over $(\alpha + 2) \cdot \mathrm{Min}'_t + 1$, which comes from the invariant and one key insertion. Then, we have
that

$$L_t(u) \le \frac{(\alpha + 2) \cdot \mathrm{Min}'_t + 1}{2} \le \frac{(\alpha + 2) \cdot \mathrm{Min}'_t + \mathrm{Min}'_t}{2} = \frac{\alpha + 3}{2} \cdot \mathrm{Min}'_t.$$

For any $\alpha \ge 3$, it follows that $\frac{\alpha + 3}{2} \le \alpha$. Hence, $L_t(u) \le \alpha \cdot \mathrm{Min}'_t$.
From the non-decreasing property, we have $L_t(u) \le \alpha \cdot \mathrm{Min}_t$.

We are left to consider the second case, when the MinBalance steps are not invoked. From the
algorithm, the load of $u$ in this case is not over $\alpha \cdot \mathrm{Min}'_t$. Because the minimum load never
decreases, we have $L_t(u) \le \alpha \cdot \mathrm{Min}_t$. ∎

Note that the MinBalance procedure "sees" the load of the node with the newly inserted key and the
minimum load, while it ignores the load of node $z$, the previous neighbor of the minimum-loaded node.
We say that a min-transfer event occurs on node $z$ when the minimum-loaded node transfers its load to $z$,
its lightly-loaded neighbor. Most of our analysis deals with the load of nodes that receive this kind of
transfer.

Consider a sequence of min-transfer events occurring on node $z$. Let $t_i$ represent the time after the
$i$-th min-transfer event occurs on $z$. Note that before a min-transfer event occurs on $z$, an insertion may
occur on it. The next lemma shows that the load of $z$ has a good ratio in this case.

Lemma 3 Suppose that an insertion occurs on node $z$ at some time $t$ right before the $i$-th min-transfer
event occurs on $z$. Then $L_{t_i}(z) \le (\alpha + 1) \cdot \mathrm{Min}_{t_i}$.

Proof: From Lemma 2, after an insertion occurs on $z$ at some time $t$, $L_t(z) \le \alpha \cdot \mathrm{Min}_t$. The
load of $z$ increases again when the $i$-th min-transfer event occurs at time $t_i$. Thus,

$$L_{t_i}(z) = L_t(z) + \mathrm{Min}'_{t_i} \le \alpha \cdot \mathrm{Min}_t + \mathrm{Min}'_{t_i}.$$

Therefore, we have that $L_{t_i}(z) \le (\alpha + 1) \cdot \mathrm{Min}_{t_i}$ because of the non-decreasing property
of the minimum load. ∎

As a result of Lemma 3, later when dealing with the load of node $z$, we only have to consider the
case that no insertion occurs on $z$ before a min-transfer event occurs on it.

In our analysis, we categorize min-transfer events into two types: the left-transfer event and the
right-transfer event. The left-transfer event on $z$ is a min-transfer event in which $z$ gets the keys from the
node to the left of $z$ (see Figure 1 (a)). The right-transfer event on $z$ is a min-transfer event
in which $z$ gets the keys from the node to the right of $z$ (see Figure 1 (b)).

When a new min-transfer event occurs on $z$, there are two situations: the new min-transfer event
is of the same type as the previous min-transfer event, or it is of the other type.

The next lemma bounds the load of $z$ when $l$ min-transfer events of the same type occur on $z$ (possibly
not one after another). Note that it is straightforward to show a bound of $(\alpha + l + 2)$, but we
need a better bound.

Lemma 4 Suppose $\alpha \ge 1 + \sqrt{5} \approx 3.237$ and $l \ge 0$. Consider the $i$-th, $(i+1)$-th, ...,
$(i+l+1)$-th min-transfer events on $z$. If the $i$-th and $(i+l+1)$-th min-transfer events are of the same
type, while the $(i+1)$-th, $(i+2)$-th, ..., $(i+l)$-th min-transfer events are of a type different from that of
the $i$-th and $(i+l+1)$-th events, then after the $(i+l+1)$-th min-transfer event,
$L_{t_{i+l+1}}(z) \le (\alpha + l + 1) \cdot \mathrm{Min}_{t_{i+l+1}}$.

Proof: Without loss of generality, we assume that the $i$-th and $(i+l+1)$-th min-transfer events
occurring on $z$ are right-transfer events, and the $(i+1)$-th, $(i+2)$-th, ..., $(i+l)$-th are left-transfer
events. Let $v_i$ be the minimum-loaded node at time $t_i$.

We first deal with the case that there exists an insertion occurring on $z$ between the $i$-th and
$(i+l+1)$-th min-transfer events. Assume that the latest insertion occurs on $z$ right before the $k$-th
min-transfer event, where $i < k \le i+l+1$. From Lemma 3, after the $k$-th min-transfer event, the load of $z$
is not over $\alpha + 1$ times the minimum load at time $t_k$, i.e.,

$$L_{t_k}(z) \le (\alpha + 1) \cdot \mathrm{Min}_{t_k}.$$

Since no insertion occurs on $z$ after the $k$-th min-transfer event, we have

$$L_{t_{i+l+1}}(z) \le (\alpha + 1) \cdot \mathrm{Min}_{t_k} + \mathrm{Min}_{t_{k+1}} + \cdots + \mathrm{Min}_{t_{i+l+1}}.$$

Because the minimum load never decreases, it follows that

$$L_{t_{i+l+1}}(z) \le (\alpha + 1) \cdot \mathrm{Min}_{t_{i+l+1}} + (i + l + 1 - k) \cdot \mathrm{Min}_{t_{i+l+1}},$$

and finally we have $L_{t_{i+l+1}}(z) \le (\alpha + l + 1) \cdot \mathrm{Min}_{t_{i+l+1}}$, because $k > i$.

We are left to consider the case that there is no insertion into $z$ between the $i$-th and $(i+l+1)$-th
min-transfer events. We make the first claim.

Claim 1 $L_{t_{i+l+1}}(z) \le L_{t_i}(z) + (l + 1) \cdot \mathrm{Min}'_{t_{i+l+1}}$.

Proof: To prove the claim, we note that for any $i < j \le i+l+1$, the load of $z$ right before the $j$-th
min-transfer event is equal to its load after the $(j-1)$-th min-transfer event, i.e.,

$$L'_{t_j}(z) = L_{t_{j-1}}(z). \quad (1)$$

Also, for any $i < j \le i+l+1$, the load of $z$ after the $j$-th min-transfer event increases by the minimum
load before time $t_j$, $\mathrm{Min}'_{t_j}$. Thus,

$$L_{t_j}(z) = L'_{t_j}(z) + \mathrm{Min}'_{t_j}, \quad (2)$$

and from Eq. (1),

$$L_{t_j}(z) = L_{t_{j-1}}(z) + \mathrm{Min}'_{t_j}. \quad (3)$$

Telescoping, we have

$$L_{t_{i+l+1}}(z) \le L_{t_i}(z) + \mathrm{Min}'_{t_{i+1}} + \mathrm{Min}'_{t_{i+2}} + \cdots + \mathrm{Min}'_{t_{i+l+1}} \le L_{t_i}(z) + (l + 1) \cdot \mathrm{Min}'_{t_{i+l+1}}.$$

The last step is from the non-decreasing property of the minimum load, and the claim follows. ∎

Now, consider the $i$-th min-transfer event on $z$. Recall that the $i$-th min-transfer event is a
right-transfer event. Let $x$ be the node on the right of $v_i$ right before the $i$-th min-transfer event occurs
on $z$ (see Figure 2 (a)). Note that because the $(i+l+1)$-th min-transfer event is a right-transfer event, $x$
must become the minimum-loaded node at some point after $t_i$. Let $t^*$ be the time that $x$ becomes the
minimum-loaded node after $t_i$. Node $x$ plays a crucial role in our analysis.

Figure 2: (a) The $i$-th min-transfer event on $z$. (b) The node set $P$ between $x$ and $v_{i+l+1}$.

There are two cases depending on whether $x$ calls MinBalance steps or not.

Case 1: $x$ does not call MinBalance steps between time $t_i$ and $t_{i+l+1}$.

In the $i$-th min-transfer event occurring on $z$, $v_i$ transfers its entire load to $z$ at time $t_i$, but not
to $x$. That means the load of $z$ right before the $i$-th min-transfer event is not over the load of $x$, i.e.,
$L'_{t_i}(z) \le L'_{t_i}(x)$. Then, after time $t_i$,

$$L_{t_i}(z) = L'_{t_i}(z) + \mathrm{Min}'_{t_i} \le L'_{t_i}(x) + \mathrm{Min}'_{t_i}, \quad (4)$$

and from Claim 1,

$$L_{t_{i+l+1}}(z) \le L'_{t_i}(x) + \mathrm{Min}'_{t_i} + (l + 1) \cdot \mathrm{Min}'_{t_{i+l+1}}. \quad (5)$$

After time $t_i$, $x$ becomes the minimum-loaded node at time $t^*$. Note that $x$ does not call the
MinBalance steps. Then, the only events that may occur on $x$ after time $t_i$ are insertions and
min-transfer events. Thus, $x$'s load does not decrease. We have $L'_{t_i}(x) \le \mathrm{Min}_{t^*}$. Then,

$$L_{t_{i+l+1}}(z) \le \mathrm{Min}_{t^*} + \mathrm{Min}'_{t_i} + (l + 1) \cdot \mathrm{Min}'_{t_{i+l+1}}.$$

From the non-decreasing property of the minimum load, we have
$L_{t_{i+l+1}}(z) \le (l + 3) \cdot \mathrm{Min}_{t_{i+l+1}}$, and when $\alpha \ge 2$, it follows that
$L_{t_{i+l+1}}(z) \le (\alpha + l + 1) \cdot \mathrm{Min}_{t_{i+l+1}}$.
Case 2: $x$ calls MinBalance steps between time $t_i$ and $t_{i+l+1}$.

Let $t'$ be the latest time that $x$ calls MinBalance steps. Since MinBalance steps are invoked,
some insertion must have occurred on $x$, after which $x$'s load is over $\alpha$ times the minimum load at
time $t'$. Then, the load of $x$ is divided into halves, i.e.,

$$L_{t'}(x) \ge \frac{\alpha \cdot \mathrm{Min}_{t'}}{2}.$$

Note that after time $t'$, $x$ does not call MinBalance steps and it becomes the minimum-loaded
node at time $t^*$. The only events that may occur on $x$ after time $t'$ are insertions and min-transfer
events. Thus, $x$'s load after time $t'$ does not decrease. Therefore, $L_{t'}(x) \le L_{t^*}(x)$. Moreover, we
know that $L_{t^*}(x) \le \mathrm{Min}'_{t_{i+l+1}}$ by the non-decreasing property of the minimum load. Then,

$$\mathrm{Min}'_{t_{i+l+1}} \ge \frac{\alpha \cdot \mathrm{Min}_{t'}}{2}. \quad (6)$$

From the invariant, the load of $z$ at time $t_i$ is not over $(\alpha + 2)$ times the minimum load at time $t_i$.
Since the minimum load cannot decrease, we have

$$L_{t_i}(z) \le (\alpha + 2) \cdot \mathrm{Min}_{t_i} \le (\alpha + 2) \cdot \mathrm{Min}_{t'}. \quad (7)$$

Consider Claim 1. We divide it by $\mathrm{Min}'_{t_{i+l+1}}$, i.e.,

$$\frac{L_{t_{i+l+1}}(z)}{\mathrm{Min}'_{t_{i+l+1}}} \le \frac{L_{t_i}(z) + (l + 1) \cdot \mathrm{Min}'_{t_{i+l+1}}}{\mathrm{Min}'_{t_{i+l+1}}}.$$

Figure 3: (a) The $k$-th and $(k+1)$-th min-transfer events are of the same type. (b) The $(k-1)$-th and
$k$-th min-transfer events are of the same type but the $(k+1)$-th min-transfer event is of a different type.
(c) The $(k-1)$-th and $(k+1)$-th min-transfer events are of the same type but the $k$-th min-transfer event
is of a different type.

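By the non-decreasing property of the minimum load, $\mathrm{Min}'_{t_{i+l+1}} \le \mathrm{Min}_{t_{i+l+1}}$. We have

$$\frac{L_{t_{i+l+1}}(z)}{\mathrm{Min}_{t_{i+l+1}}} \le \frac{L_{t_i}(z) + (l + 1) \cdot \mathrm{Min}'_{t_{i+l+1}}}{\mathrm{Min}'_{t_{i+l+1}}} = \frac{L_{t_i}(z)}{\mathrm{Min}'_{t_{i+l+1}}} + (l + 1),$$

and from Eq. (6) and (7),

$$\frac{L_{t_{i+l+1}}(z)}{\mathrm{Min}_{t_{i+l+1}}} \le \frac{(\alpha + 2) \cdot \mathrm{Min}_{t'}}{\alpha \cdot \mathrm{Min}_{t'} / 2} + (l + 1) = (l + 1) + \frac{2(\alpha + 2)}{\alpha}.$$

From the assumption that $\alpha \ge 1 + \sqrt{5}$, it follows that $\frac{2(\alpha + 2)}{\alpha} \le \alpha$. Then, we have

$$\frac{L_{t_{i+l+1}}(z)}{\mathrm{Min}_{t_{i+l+1}}} \le (\alpha + l + 1).$$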
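
The threshold $1 + \sqrt{5}$ in the assumption is exactly the point at which the last inequality starts to hold. The short derivation below is added as a worked check; it uses only that $\alpha > 0$, so that multiplying by $\alpha$ preserves the direction of the inequality.

```latex
\[
  \frac{2(\alpha + 2)}{\alpha} \le \alpha
  \iff 2\alpha + 4 \le \alpha^{2}
  \iff \alpha^{2} - 2\alpha - 4 \ge 0
  \iff \alpha \ge 1 + \sqrt{5}
  \qquad (\alpha > 0),
\]
```

since the roots of $\alpha^2 - 2\alpha - 4$ are $1 \pm \sqrt{5}$ and only the larger root is positive.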



Using previous lemmas, we can conclude the bound on the load of z after a min-transfer event 
occurring on it. 



Lemma 5 Suppose $\alpha \ge 1 + \sqrt{5} \approx 3.237$. After the $i$-th min-transfer event occurs on $z$,
$L_{t_i}(z) \le (\alpha + 2) \cdot \mathrm{Min}_{t_i}$.



Proof: We prove by induction on the number of min-transfer events occurring on $z$.

For the base case, consider the first min-transfer event occurring on $z$. If some insertion occurs on $z$
before the min-transfer event, then from Lemma 3, $L_{t_1}(z) \le (\alpha + 1) \cdot \mathrm{Min}_{t_1}$. We are left
to consider the case that no insertion occurs on $z$ before the first min-transfer event. The load of $z$ before
the min-transfer event is equal to its load at the beginning of the system, $c_0$. Note that the minimum
load at time $t_1$ is also $c_0$. After the first min-transfer event, $L_{t_1}(z) = 2 \cdot \mathrm{Min}_{t_1}$. Thus, the load
of $z$ in both cases is not over $(\alpha + 2) \cdot \mathrm{Min}_{t_1}$.

Assume that the load of any node at time $t_k$ is not over $(\alpha + 2) \cdot \mathrm{Min}_{t_k}$. Consider the
$(k+1)$-th min-transfer event on $z$. There are two cases.

Case 1: The $(k+1)$-th and $k$-th min-transfer events are of the same type (see Figure 3 (a)). From
Lemma 4 with $l = 0$, after the $(k+1)$-th min-transfer event,
$L_{t_{k+1}}(z) \le (\alpha + 1) \cdot \mathrm{Min}_{t_{k+1}}$. Thus, the lemma holds in this case.

Case 2: The $(k+1)$-th min-transfer event is of a different type from the $k$-th min-transfer event. If
there is any insertion into $z$ between time $t_k$ and $t_{k+1}$, then from Lemma 3,
$L_{t_{k+1}}(z) \le (\alpha + 1) \cdot \mathrm{Min}_{t_{k+1}}$. Otherwise, we consider the $(k-1)$-th min-transfer event.
There are three sub cases.

Sub case 2.1: There is no $(k-1)$-th min-transfer event; hence, the $k$-th min-transfer event
is the first min-transfer event. We can bound the load of $z$ after $t_k$ as in the base case, i.e.,
$L_{t_k}(z) \le (\alpha + 1) \cdot \mathrm{Min}_{t_k}$. The load of $z$ increases again when the $(k+1)$-th min-transfer
event occurs on it at time $t_{k+1}$, i.e.,

$$L_{t_{k+1}}(z) = L_{t_k}(z) + \mathrm{Min}'_{t_{k+1}} \le (\alpha + 1) \cdot \mathrm{Min}_{t_k} + \mathrm{Min}'_{t_{k+1}}.$$

From the non-decreasing property of the minimum load, we have that
$L_{t_{k+1}}(z) \le (\alpha + 2) \cdot \mathrm{Min}_{t_{k+1}}$. Thus, the lemma holds in this sub case.

Sub case 2.2: The $(k-1)$-th min-transfer event is of the same type as the $k$-th min-transfer event (see
Figure 3 (b)). From Lemma 4 with $l = 0$, after the $k$-th min-transfer event,
$L_{t_k}(z) \le (\alpha + 1) \cdot \mathrm{Min}_{t_k}$. The load of $z$ increases again by the $(k+1)$-th min-transfer
event, i.e.,

$$L_{t_{k+1}}(z) = L_{t_k}(z) + \mathrm{Min}'_{t_{k+1}} \le (\alpha + 1) \cdot \mathrm{Min}_{t_k} + \mathrm{Min}'_{t_{k+1}}.$$

From the non-decreasing property of the minimum load, we have
$L_{t_{k+1}}(z) \le (\alpha + 2) \cdot \mathrm{Min}_{t_{k+1}}$. Thus, the lemma holds in this sub case.

Sub case 2.3: The $(k-1)$-th min-transfer event is of a different type from the $k$-th min-transfer event,
i.e., the $(k-1)$-th min-transfer event is of the same type as the $(k+1)$-th min-transfer event (see Figure 3
(c)). From Lemma 4 with $l = 1$, after the $(k+1)$-th min-transfer event,
$L_{t_{k+1}}(z) \le (\alpha + 2) \cdot \mathrm{Min}_{t_{k+1}}$, and thus the lemma holds in this sub case. ∎

We are ready to prove the invariant. Note that the load of any node may change due to an insertion or
a min-transfer event. From Lemmas 2 and 5, we can conclude the invariant, which guarantees the bound
on any load in the system.

Theorem 1 Consider the insert-only case. Suppose $\alpha \ge 1 + \sqrt{5} \approx 3.237$. For any node
$u \in V$, after any event at time $t$, $L_t(u) \le (\alpha + 2) \cdot \mathrm{Min}_t$.

In our load-balancing algorithm, we want to minimize the number of moving keys and to guarantee 
a constant imbalance ratio. Imbalance ratio, a, is defined as the ratio of the maximum to minimum load 
in the system. We show that the imbalance ratio from our algorithm in the insert-only case is a constant. 

Corollary 1 Consider the insert-only case. Suppose a > 1 + v5 ~ 3.237. The imbalance ratio of the 
algorithm is a constant. 

Proof: Prove directly from Theorem [1] I 



4.3 Cost of the algorithm in insert-only case 

Our algorithm uses two operations, i.e., the insert and MinBalance operations. We consider the cost of
each operation. Moving a single key from one node to another is counted as a unit cost. We follow the
analysis in [4], based on the potential function method, and use the same potential function.

Theorem 2 The amortized costs of our algorithm in the insert-only case are constant.

Proof: Let $\bar{L}_t$ denote the average load at time $t$ and let $N_i$ be the $i$-th node, which manages a range
$[R_{i-1}, R_i)$. We consider the same potential function as [4], i.e.,

$$\Phi = \frac{c \sum_{i=1}^{n} L_t(N_i)^2}{\bar{L}_t},$$

where $c$ is a constant to be specified later. We show that the gain in potential when an insertion occurs is at most
a constant and that the drop in potential when a MinBalance operation occurs pays for the cost of the
operation.

Insertion: Consider an insertion of a key occurring on node $N_j$ at time $t$ before any load-balancing
steps are invoked. Note that the load of every node except $N_j$ does not change during the insertion. Thus,
the gain in potential, $\Delta\Phi$, is

$$\Delta\Phi = \frac{c\left(\sum_{i=1}^{n} L_t(N_i)^2\right) - c\,L_t(N_j)^2 + c\,(L_t(N_j)+1)^2}{\bar{L}_t + \frac{1}{n}} - \frac{c\left(\sum_{i=1}^{n} L_t(N_i)^2\right)}{\bar{L}_t}$$

$$\le \frac{c}{\bar{L}_t + \frac{1}{n}}\left((L_t(N_j)+1)^2 - L_t(N_j)^2\right) \le \frac{c\,(2L_t(N_j)+1)}{\bar{L}_t}.$$

From the invariant, $L_t(N_j) \le (\alpha+2) \cdot Min_t$. Since $\bar{L}_t \ge Min_t \ge 1$, we have $L_t(N_j) \le (\alpha+2) \cdot \bar{L}_t$.
Hence,

$$\Delta\Phi \le \frac{c\,(2(\alpha+2) \cdot \bar{L}_t + 1)}{\bar{L}_t} \le c\,(2\alpha+5).$$

Since an insertion moves a new key to some node, the actual cost of an insertion is a unit cost. Thus,
the amortized cost of an insertion is bounded by a constant.
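The insertion bound can be checked numerically. The following is a minimal sketch, assuming the potential function is $c$ times the sum of squared loads divided by the average load, as in the definition above; the function names and the sample loads are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def potential(loads, c=1):
    """Phi = c * sum(L_i^2) / average load, computed exactly."""
    avg = Fraction(sum(loads), len(loads))
    return c * sum(Fraction(x) ** 2 for x in loads) / avg

def insertion_gain(loads, j, c=1):
    """Change in Phi when one key is inserted at node j, before any balancing."""
    after = list(loads)
    after[j] += 1
    return potential(after, c) - potential(loads, c)

def insertion_bound(loads, j, c=1):
    """The bound c * (2 * L_j + 1) / average load from the derivation."""
    avg = Fraction(sum(loads), len(loads))
    return c * Fraction(2 * loads[j] + 1) / avg

loads = [4, 6, 9, 5]  # hypothetical loads, chosen only for illustration
for j in range(len(loads)):
    assert insertion_gain(loads, j) <= insertion_bound(loads, j)
```

Exact rational arithmetic is used so that the comparison is not affected by floating-point rounding.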

MinBalance: There are three nodes involved, i.e., $N_j$, which calls MinBalance steps, the minimum-loaded
node $N_k$, and $N_k$'s lightly-loaded neighbor $N_l$. When node $N_j$ calls MinBalance steps, $N_k$
transfers its entire load to $N_l$. After that, $N_k$ takes half the load of $N_j$. The drop in potential is

$$\Delta\Phi = \frac{c\left(L_t(N_j)^2 + L_t(N_k)^2 + L_t(N_l)^2 - 2\left(\frac{L_t(N_j)}{2}\right)^2 - (L_t(N_k) + L_t(N_l))^2\right)}{\bar{L}_t} = \frac{c\left(\frac{L_t(N_j)^2}{2} - 2L_t(N_k)L_t(N_l)\right)}{\bar{L}_t}.$$

From the invariant, we have $L_t(N_l) \le (\alpha+2) \cdot L_t(N_k)$. Since $N_j$ calls the MinBalance steps, we
know that $L_t(N_k) < \frac{L_t(N_j)}{\alpha}$. We have $L_t(N_l) \le (\alpha+2) \cdot \frac{L_t(N_j)}{\alpha}$. Then,

$$\Delta\Phi \ge \frac{c\left(\frac{L_t(N_j)^2}{2} - \frac{2(\alpha+2) \cdot L_t(N_j)^2}{\alpha^2}\right)}{\bar{L}_t}.$$

Again, from the invariant, we have $\bar{L}_t \le (\alpha+2) \cdot L_t(N_k) \le (\alpha+2) \cdot L_t(N_j)$. Then,

$$\Delta\Phi \ge c\,L_t(N_j)\left(\frac{1}{2(\alpha+2)} - \frac{2}{\alpha^2}\right).$$

The data movement of MinBalance steps is $\frac{L_t(N_j)}{2} + L_t(N_k) < L_t(N_j)$. For any $c \ge \left(\frac{1}{2(\alpha+2)} - \frac{2}{\alpha^2}\right)^{-1}$,
we have $\Delta\Phi \ge L_t(N_j)$. Thus, the data movement cost can be paid by this drop in potential. □



5 The algorithm for the general case 

In this section, we consider the general case that supports both insert and delete operations. In order to 
deal with the imbalance ratio, after these operations, load balancing steps are invoked. For insertion, we 
perform the MinBalance operation presented in the previous section. On the other hand, for deletion, 
we present the Split operation. 



5.1 The Split operation 

The Split operation is invoked after a deletion occurs on any node u. Let z be the lightly-loaded
neighbor of u. The load-balancing steps are called when the load of u is less than a 1/β fraction of the
maximum load at that time. There are two types of load-balancing steps, depending on the load of z.
If the load of z is more than a 2/β fraction of the maximum load, u averages out its load with z. We call
these steps SplitNbr. Otherwise, u transfers its entire load to z. After that, u takes half the
load of the maximum-loaded node. These steps are called SplitMax. Note that SplitNbr
calls only the NbrAdjust operation, whereas SplitMax calls both the NbrAdjust and the Reorder
operations.

We again note that, to be able to perform these operations, the system must maintain non-local
information, i.e., the maximum load.

Procedure Split(u)

    Let w be the maximum-loaded node in the system.
    if L(u) < L(w)/β then
        Let z be the lightly-loaded neighbor of u.
        if L(z) ≤ 2·L(w)/β then
            // SplitMax steps
            Transfer all keys of u to z.
            Transfer a half-load of w to u, s.t. L(w) = L(w)/2 and L(u) = L(w)/2.
        else
            // SplitNbr steps
            Move keys from z to u to equalize load, s.t. L(u) = (L(u) + L(z))/2 and L(z) = (L(u) + L(z))/2.
        end if
        {Rename nodes appropriately.}
    end if
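The Split steps can be sketched in executable form as follows. This is only an illustration under stated assumptions: the trigger thresholds are read as L(w)/β and 2·L(w)/β, consistent with the conditions used in the proofs of Lemmas 6 and 7; halves are rounded with integer division, which is a choice made here; and the function name `split` and the list-of-loads model are hypothetical. Node renaming and relocation are not modelled, so only the load values change.

```python
def split(loads, u, beta):
    """Apply the Split steps to a list of loads after a deletion at index u.

    Returns "none", "SplitMax" or "SplitNbr". Assumes beta >= 3 and at
    least two nodes whenever balancing is triggered.
    """
    w = max(range(len(loads)), key=lambda i: loads[i])   # maximum-loaded node
    max_load = loads[w]
    if loads[u] * beta >= max_load:                      # L(u) >= L(w)/beta: nothing to do
        return "none"
    nbrs = [i for i in (u - 1, u + 1) if 0 <= i < len(loads)]
    z = min(nbrs, key=lambda i: loads[i])                # lightly-loaded neighbor of u
    if loads[z] * beta <= 2 * max_load:                  # L(z) <= 2 L(w)/beta
        loads[z] += loads[u]                             # u hands its entire load to z
        loads[u] = max_load // 2                         # u takes half of w's load
        loads[w] = max_load - max_load // 2
        return "SplitMax"
    total = loads[u] + loads[z]                          # otherwise average u and z
    loads[u] = total // 2
    loads[z] = total - total // 2
    return "SplitNbr"

example = [10, 2, 30, 3]            # hypothetical loads; node 1 just lost a key
kind = split(example, 1, beta=3)
```

In this sketch the total number of keys is preserved by every branch, which is a convenient sanity check when experimenting with other load vectors.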



5.2 Analysis of imbalance ratio for the general case 

We use the same notion of time and the same invariant as in the previous section. We analyze the
imbalance ratio after any event. In the previous section, we analyzed the ratio after insertion and min-transfer
events. In this section, we have to deal with deletion and two more events:

• The nbr-transfer event on node $z$ is the event that occurs when $z$ receives load from its neighbor,
which invokes the SplitMax steps, and

• The nbr-share event on node $z$ is the event that occurs when $z$ shares load with its neighbor, which
invokes the SplitNbr steps.

Let $Max_t$ and $Max'_t$ denote the maximum load in the system after time $t$ and right before time $t$,
respectively, i.e., $Max_t = \max_{u \in V} L_t(u)$ and $Max'_t = \max_{u \in V} L'_t(u)$.

5.2.1 Overview of the analysis 

The major obstacle to applying the proofs of the insert-only case to the general case is the assumption
that the minimum load cannot decrease over time. To handle this, we analyze the system in phases.
Each phase spans a period in which the minimum load is non-decreasing; this allows us to apply mostly
the same techniques to analyze the situation when an update does not change the analysis phase.

The only way an update can cause a phase change is through a deletion on the minimum-loaded
node. We show that if a deletion on the minimum-loaded node starts a phase
change, the ratio between the maximum load and the minimum load is bounded by a constant. This
provides a sufficient initial condition for the analysis of the next phase.

5.2.2 Transition between phases 

We shall analyze the transition between two consecutive phases, which occurs when the minimum load
decreases.

Consider a deletion occurring on any node. The next lemma shows the load property of a node after
a deletion occurs on it.

Lemma 6 Suppose $\beta \ge 3$. Consider the case that a deletion occurs on some node $u$ at some time $t$.
After the deletion and its corresponding load-balancing steps, $L_t(u) \ge \frac{Max_t}{\beta}$. Moreover, the minimum load
can decrease only in the case that load-balancing steps are not invoked.

Proof: After a deletion occurs on $u$, there are two cases. The first case is when load-balancing steps
are not invoked. In this case, the load of $u$ after the deletion is at least $\frac{Max'_t}{\beta}$. Note that, at time $t$, the
only event that occurs in the system is a deletion on $u$. Then, the maximum load does not increase, i.e.,
$Max'_t \ge Max_t$. We have that $L_t(u) \ge \frac{Max'_t}{\beta} \ge \frac{Max_t}{\beta}$.

We are left to consider the second case, when load-balancing steps are invoked. In this case, the load
of $u$ is less than $\frac{Max'_t}{\beta}$. Let $z$ be the lightly-loaded neighbor of $u$ at time $t$. There are two sub cases.

Sub case 1: Node $u$ calls SplitMax steps. First, $u$ transfers its entire load to $z$. After that, $u$
takes half the load of the maximum-loaded node, i.e.,

$$L_t(u) = \frac{Max'_t}{2} \ge \frac{Max'_t}{\beta}.$$

In this sub case, the only node other than $u$ whose load increases at time $t$ is $z$. Note that $z$ receives load from
$u$, i.e.,

$$L_t(z) = L'_t(z) + L'_t(u) < \frac{2Max'_t}{\beta} + \frac{Max'_t}{\beta} = \frac{3Max'_t}{\beta}.$$

From the assumption that $\beta \ge 3$, $L_t(z) \le Max'_t$. Hence the maximum load does not increase at time $t$,
and we have $L_t(u) \ge \frac{Max_t}{\beta}$.

Sub case 2: Node $u$ calls SplitNbr steps. This sub case occurs when $L'_t(z) > \frac{2Max'_t}{\beta}$. The load of
$z$ is shared with $u$ to balance their loads, i.e.,

$$L_t(u) = \frac{L'_t(z) + L'_t(u)}{2} > \frac{Max'_t}{\beta}.$$

In this sub case, there are two nodes involved, $u$ and $z$. Note that the only node in the system whose
load increases at time $t$ is $u$. We know that $L'_t(u) < \frac{Max'_t}{\beta} < \frac{Max'_t}{2}$. Then, $u$'s load after the load-balancing
steps cannot exceed $Max'_t$, because its load is less than $\frac{Max'_t}{2}$ and the load it receives cannot exceed $\frac{Max'_t}{2}$. Hence
the maximum load does not increase at time $t$. Therefore, we have $L_t(u) \ge \frac{Max_t}{\beta}$.

Consider the load of the minimum-loaded node after a deletion occurs on it at time $t$. When it
is at least $\frac{Max'_t}{\beta}$, load-balancing steps are not invoked. From the first case, the minimum load can
decrease. When it is less than $\frac{Max'_t}{\beta}$, load-balancing steps are invoked. From the second case, the load
after the load-balancing steps is at least $\frac{Max_t}{\beta}$. Thus, the minimum load does not decrease after the deletion
in this case. □



From Lemma 6, the minimum load at time $t$ can decrease only in the case that load-balancing steps are
not invoked. After the minimum load decreases at time $t$, we have $Min_t \ge \frac{Max_t}{\beta}$ and the phase changes.
Moreover, the following condition holds:

At the beginning of each phase at time $t$, the imbalance ratio guarantee is $\beta$, i.e.,

$$\frac{Max_t}{Min_t} \le \beta.$$

5.2.3 Imbalance ratio inside each phase 

We prove that the same invariant as in the previous section, i.e.,

For any time $t$, the load of any node is not over $(\alpha+2)$ times the minimum load,

holds after each operation.

At the beginning of each phase, the imbalance ratio guarantee is $\beta$. At the end, we choose $\beta$ such that
$\beta \le \alpha$; this implies that at the beginning of each phase, $Max_t \le \beta \cdot Min_t \le (\alpha+2) \cdot Min_t$, as required.

In the analysis below, we ignore the case of a deletion that is not followed by load-balancing
steps, because the load of the affected node can never violate the ratio. Thus, events of this type are not
considered in our analysis.

To analyze the imbalance ratio, we consider how the load of each affected node changes. We deal
with two easy events first. For insertion, we use Lemma 2 to bound the load of the node receiving the insertion, and
for deletion, we use Lemma 6 to bound the load of the node on which the deletion occurs. For the rest of this section, we
are left with the nodes that are affected by load-balancing steps. There are three types of events, i.e.,
nbr-transfer, nbr-share and min-transfer events.

We summarize the events and how to deal with them as follows.

• For the nbr-transfer events, we focus on a lightly-loaded neighbor of the deleted node. It receives
the entire load of the deleted node. Lemma 7 deals with this type of event.

• For the nbr-share events, a lightly-loaded neighbor of the deleted node averages out its load with
the deleted node. The imbalance ratio is also proved in Lemma 7.

• For the min-transfer events, we focus on the load of a lightly-loaded neighbor of the minimum-loaded
node. It receives load from the minimum-loaded node. Lemma 11 deals with this type of
event.

The next lemma handles the case of nbr-transfer and nbr-share events.

Lemma 7 Suppose $\alpha \ge 3$ and $\beta \ge \frac{3(\alpha+2)}{\alpha}$. Consider the case that a deletion occurs on node $u$ at
some time $t$ and $u$ calls load-balancing steps. Let $z$ be the lightly-loaded neighbor of $u$ at time $t$. Then
$L_t(u) \le \alpha \cdot Min_t$ and $L_t(z) \le \alpha \cdot Min_t$.

Proof: There are two types of load-balancing steps after a deletion.

Case 1: Node $u$ invokes SplitMax steps and an nbr-transfer event occurs on $z$. This case occurs
when $L'_t(z) \le \frac{2Max'_t}{\beta}$ and $L'_t(u) < \frac{Max'_t}{\beta}$. Node $u$ proceeds by transferring its entire load to $z$, i.e.,

$$L_t(z) = L'_t(z) + L'_t(u) < \frac{2Max'_t}{\beta} + \frac{Max'_t}{\beta} = \frac{3Max'_t}{\beta}.$$

From the invariant, we have $L_t(z) \le \frac{3(\alpha+2) \cdot Min'_t}{\beta}$. Since $\beta \ge \frac{3(\alpha+2)}{\alpha}$, it follows that $\frac{3(\alpha+2)}{\beta} \le \alpha$. Then,
the load of $z$ at time $t$ is not over $\alpha \cdot Min'_t$. From the non-decreasing property of the minimum load, we
have $L_t(z) \le \alpha \cdot Min_t$.

After that, $u$ takes half the load of the maximum-loaded node. From the invariant, we have that
$L_t(u) \le \frac{(\alpha+2) \cdot Min'_t}{2}$. When $\alpha \ge 2$, it follows that $\frac{\alpha+2}{2} \le \alpha$. Then, the load of $u$ after the deletion is not
over $\alpha \cdot Min'_t$, and we have $L_t(u) \le \alpha \cdot Min_t$ from the non-decreasing property of the minimum load.

Case 2: Node $u$ invokes SplitNbr steps and an nbr-share event occurs on $z$. This case occurs when
$L'_t(z) > \frac{2Max'_t}{\beta}$ and $L'_t(u) < \frac{Max'_t}{\beta}$. From the invariant, we have $L'_t(z) \le Max'_t \le (\alpha+2) \cdot Min'_t$. Nodes
$u$ and $z$ share their loads equally, i.e.,

$$L_t(u) = L_t(z) = \frac{L'_t(z) + L'_t(u)}{2} < \frac{Max'_t}{2} + \frac{Max'_t}{2\beta} = \frac{(\beta+1) \cdot Max'_t}{2\beta}.$$

From the assumption that $\alpha \ge 3$ and $\beta \ge \frac{3(\alpha+2)}{\alpha}$, we have that $\beta \ge \frac{\alpha+2}{\alpha-2}$. Moreover, when $\beta \ge \frac{\alpha+2}{\alpha-2}$
and $\alpha > 2$, it follows that $\frac{(\beta+1)(\alpha+2)}{2\beta} \le \alpha$. Then, the loads of $z$ and $u$ after the load-balancing steps are not
over $\alpha \cdot Min'_t$. From the non-decreasing property of the minimum load, we have $L_t(u) \le \alpha \cdot Min_t$ and
$L_t(z) \le \alpha \cdot Min_t$. □
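The three inequalities used in the proof of Lemma 7 can be checked with exact arithmetic. The sketch below assumes the lemma's conditions are read as α ≥ 3 and β ≥ 3(α+2)/α; the helper name `lemma7_factors` and the sample values of α are hypothetical. Each factor is the multiple of the minimum load that bounds the affected node, obtained by substituting the invariant Max' ≤ (α+2)·Min' into the corresponding load bound.

```python
from fractions import Fraction

def lemma7_factors(alpha, beta):
    """Multiples of the minimum load that bound z and u after each Split variant."""
    return {
        "SplitMax_z": 3 * (alpha + 2) / beta,                 # from 3 Max' / beta
        "SplitMax_u": (alpha + 2) / 2,                        # from Max' / 2
        "SplitNbr": (beta + 1) * (alpha + 2) / (2 * beta),    # from (Max' + Max'/beta) / 2
    }

for alpha in (Fraction(3), Fraction(4), Fraction(9, 2), Fraction(10)):
    beta = 3 * (alpha + 2) / alpha        # smallest beta allowed under the assumed condition
    for name, factor in lemma7_factors(alpha, beta).items():
        assert factor <= alpha, (alpha, name, factor)
```

At α = 3 the SplitMax bound on z and the SplitNbr bound both equal α exactly, which illustrates why the condition α ≥ 3 appears in the lemma.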

We are left with the min-transfer case. Let $e$ be the min-transfer event occurring on $z$. The next lemma
deals with the case that $e$ is the first event occurring on $z$ in phase $d$.

Lemma 8 Consider any phase $d$. Suppose $\beta \ge 2$. If the first event occurring on $z$ in that phase is the
$i$-th min-transfer event, then $L_{t_i}(z) \le (\beta+1) \cdot Min_{t_i}$.

Proof: There are two cases to consider. First, we consider the case that $d = 1$, i.e., the first phase.
At the beginning of this phase, the load of every node is equal to $c_0$. Note that the load of $z$ does not
change until the first min-transfer event occurs on it. Moreover, at time $t_1$, the minimum load is also $c_0$,
because it cannot decrease and it cannot increase over $c_0$. When the min-transfer event occurs on $z$, its
load increases, i.e., $L_{t_1}(z) = c_0 + Min'_{t_1} = 2Min'_{t_1}$. From the non-decreasing property of the minimum
load, we have $L_{t_1}(z) \le 2Min_{t_1}$. From the assumption that $\beta \ge 2$, it follows that $L_{t_1}(z) \le (\beta+1) \cdot Min_{t_1}$.

For the case that $d > 1$, let $t'$ be the beginning time of phase $d$. From the load condition at the
beginning of each phase, $Max_{t'} \le \beta \cdot Min_{t'}$. Therefore, $L_{t'}(z) \le \beta \cdot Min_{t'}$. After the $i$-th min-transfer
event occurs on $z$, its load increases, i.e., $L_{t_i}(z) = L_{t'}(z) + Min'_{t_i} \le \beta \cdot Min_{t'} + Min'_{t_i}$. Because the
minimum load never decreases in each phase, we have $L_{t_i}(z) \le (\beta+1) \cdot Min_{t_i}$ as required. □

Now, assume that $e$ is not the first event occurring on $z$ in phase $d$.

Let $e'$ be the latest event that occurs on $z$ before event $e$. The next lemma considers the case
where $e'$ is not a min-transfer event, while Lemma 10, which is more involved, considers the case when
$e'$ is a min-transfer event.

Lemma 9 Consider any phase. Suppose $\alpha \ge 3$ and $\beta \ge \frac{3(\alpha+2)}{\alpha}$. If any event $e'$ other than a min-transfer
event occurs on node $z$ right before the $i$-th min-transfer event, then after the $i$-th min-transfer event
occurs on $z$, $L_{t_i}(z) \le (\alpha+1) \cdot Min_{t_i}$.

Proof: An event $e'$ can be an insertion, a deletion that is followed by load-balancing steps,
an nbr-transfer event, or an nbr-share event. After event $e'$ occurs on $z$ at time $t'$, from Lemma 2
(for insertions) and Lemma 7 (for other events), we have $L_{t'}(z) \le \alpha \cdot Min_{t'}$. The load of $z$ increases
again from the $i$-th min-transfer event, i.e., $L_{t_i}(z) = L_{t'}(z) + Min'_{t_i} \le \alpha \cdot Min_{t'} + Min'_{t_i}$. Because the
minimum load does not decrease in each phase, it follows that $L_{t_i}(z) \le (\alpha+1) \cdot Min_{t_i}$. □

Finally, we consider the case when the latest event $e'$ before event $e$ is a min-transfer event. Recall
that we categorize min-transfer events into two types: the left-transfer event and the right-transfer
event. The next lemma is a generalization of Lemma 4, but it additionally deals with deletion, SplitMax and
SplitNbr.

Lemma 10 Suppose $\alpha \ge \frac{3+\sqrt{33}}{2} \approx 4.373$, $\beta \ge \frac{3(\alpha+2)}{\alpha}$ and $l \ge 0$. Consider the $i$-th, $(i+1)$-th, ...,
$(i+l+1)$-th min-transfer events on $z$ in any phase. If the $i$-th and $(i+l+1)$-th min-transfer events
are of the same type, while the $(i+1)$-th, $(i+2)$-th, ..., $(i+l)$-th min-transfer events are of a type
different from that of the $i$-th and $(i+l+1)$-th events, then after the $(i+l+1)$-th min-transfer event,
$L_{t_{i+l+1}}(z) \le (\alpha+l+1) \cdot Min_{t_{i+l+1}}$.

Proof: Without loss of generality, we assume that the $i$-th and $(i+l+1)$-th min-transfer events
occurring on $z$ are right-transfer events, and that the $(i+1)$-th, $(i+2)$-th, ..., $(i+l)$-th are left-transfer events.
Let $v_i$ be the minimum-loaded node at time $t_i$.

We first deal with the case that there exists an event other than a min-transfer event occurring on
$z$ between the $i$-th and $(i+l+1)$-th min-transfer events. Assume that the latest event other than a
min-transfer event occurs on $z$ right before the $k$-th min-transfer event, where $i < k \le i+l+1$. From
Lemma 9, after the $k$-th min-transfer event, we have

$$L_{t_k}(z) \le (\alpha+1) \cdot Min_{t_k}.$$

After the $k$-th min-transfer event, only min-transfer events can occur on $z$. Note that, after time
$t_k$, there are at most $i+l-k$ min-transfer events occurring on $z$ by time $t_{i+l+1}$. Since the minimum
load never decreases, we have that

$$L_{t_{i+l+1}}(z) \le (\alpha+1) \cdot Min_{t_{i+l+1}} + (i+l-k) \cdot Min_{t_{i+l+1}},$$

and finally we have $L_{t_{i+l+1}}(z) \le (\alpha+l+1) \cdot Min_{t_{i+l+1}}$, because $k > i$.

We are left to consider the case that no events other than min-transfer events occur on $z$
between the $i$-th and $(i+l+1)$-th min-transfer events. We follow Claim 1 in the proof of Lemma 4,
using the property that in each phase the minimum load does not decrease. Then, we have

$$L_{t_{i+l+1}}(z) \le L_{t_i}(z) + (l+1) \cdot Min'_{t_{i+l+1}}. \quad (8)$$

Consider the $i$-th min-transfer event on $z$. Recall that it is a right-transfer event. Let $x$ be the
node on the right of $v_i$ right before the $i$-th min-transfer event. Note that, because the $(i+l+1)$-th
min-transfer event is also a right-transfer event, $x$ must become the minimum-loaded node at some point
after time $t_i$. Let $t^*$ be the time that $x$ becomes the minimum-loaded node after time $t_i$.

Instead of focusing on node $x$, we consider a set of nodes $P$ arranged in consecutive order from
$x$ to $v_{i+l+1}$ between time $t_i$ and $t_{i+l+1}$ (see Figure 2(b)). We bound the load of $z$ by the load of a node
in this set. There are two cases.

Case 1: No node in set $P$ calls MinBalance, SplitMax or SplitNbr steps between time $t_i$
and $t_{i+l+1}$. In this case, we consider the node $x$. Note that deletions may occur on $x$. There are two sub
cases.

Sub case 1.1: No deletion occurs on $x$ after $t_i$. Consider the $i$-th min-transfer event occurring on
$z$. The minimum-loaded node at time $t_i$ transfers its load to $z$. That means $L'_{t_i}(z) \le L'_{t_i}(x)$. Then, we
have

$$L_{t_i}(z) = L'_{t_i}(z) + Min'_{t_i} \le L'_{t_i}(x) + Min'_{t_i},$$

and from Eq. (8), we have

$$L_{t_{i+l+1}}(z) \le L'_{t_i}(x) + Min'_{t_i} + (l+1) \cdot Min'_{t_{i+l+1}}. \quad (9)$$

After time $t_i$, deletion and load-balancing steps do not occur on $x$. The only operation that can occur on
it is an insertion. Then, $x$'s load does not decrease. At time $t^*$, $x$ becomes the minimum-loaded node.
Then, $L'_{t_i}(x) \le Min_{t^*}$. From Eq. (9), we have

$$L_{t_{i+l+1}}(z) \le Min_{t^*} + Min'_{t_i} + (l+1) \cdot Min'_{t_{i+l+1}}.$$

Because the minimum load does not decrease, we have $L_{t_{i+l+1}}(z) \le (l+3) \cdot Min_{t_{i+l+1}}$. When $\alpha \ge 2$,
we have $L_{t_{i+l+1}}(z) \le (\alpha+l+1) \cdot Min_{t_{i+l+1}}$. Thus, the lemma holds in this sub case.

Sub case 1.2: Some deletion occurs on $x$ after $t_i$ but SplitMax and SplitNbr steps are not
invoked. Let $t'$ be the time that $x$ has the minimum load after a deletion occurring on it between time $t_i$
and $t^*$. From Lemma 6, the maximum load at time $t'$ is not over $\beta$ times the load of $x$ at time $t'$.
Then, we have

$$Max_{t'} \le \beta L_{t'}(x) \le \beta L_{t^*}(x). \quad (10)$$

Note that the only event that can occur on $z$ between time $t_i$ and $t'$ is a min-transfer event. Thus,
the load of $z$ after $t_i$ does not decrease. Then, we have $L_{t_i}(z) \le L_{t'}(z)$. Consider Eq. (8). We have

$$L_{t_{i+l+1}}(z) \le L_{t'}(z) + (l+1) \cdot Min'_{t_{i+l+1}}.$$

From Eq. (10), we have

$$L_{t_{i+l+1}}(z) \le Max_{t'} + (l+1) \cdot Min'_{t_{i+l+1}} \le \beta L_{t^*}(x) + (l+1) \cdot Min'_{t_{i+l+1}}.$$

Since in each phase the minimum load never decreases, we have $L_{t_{i+l+1}}(z) \le (\beta+l+1) \cdot Min_{t_{i+l+1}}$.
From the assumption that $\alpha \ge \frac{3+\sqrt{33}}{2}$, it follows that $\alpha \ge \beta$ and thus, the lemma holds in this sub case.

Case 2: At least one node in $P$ calls MinBalance, SplitMax or SplitNbr steps between time
$t_i$ and $t_{i+l+1}$. Let $y$ be the latest node in this set that calls load-balancing steps, at time $\hat{t}$. We consider
the case that some deletion occurs on $y$ after $\hat{t}$. We can prove it in the same way as sub case 1.2, and it
follows that $L_{t_{i+l+1}}(z) \le (\alpha+l+1) \cdot Min_{t_{i+l+1}}$.

We are left to consider the case that no deletion occurs on $y$ after the MinBalance, SplitNbr or
SplitMax steps occurring on $y$ at $\hat{t}$.

Sub case 2.1: Node $y$ calls MinBalance steps at time $\hat{t}$. This sub case can be proved in the same
way as case 2 in Lemma 4. When $\alpha \ge 1 + \sqrt{5}$, it follows that $\frac{L_{t_{i+l+1}}(z)}{Min_{t_{i+l+1}}} \le (\alpha+l+1)$. Thus, the lemma
holds in this sub case.

Sub case 2.2: Node $y$ calls SplitNbr steps at time $\hat{t}$. From Lemma 6, after the SplitNbr steps, we
have $Max_{\hat{t}} \le \beta L_{\hat{t}}(y)$. Note that the only event that occurs on $z$ after $t_i$ is the min-transfer event. Then, $z$'s
load does not decrease after $t_i$. We have $L_{t_i}(z) \le L_{\hat{t}}(z)$. Consider Eq. (8). Then, it follows that

$$L_{t_{i+l+1}}(z) \le Max_{\hat{t}} + (l+1) \cdot Min'_{t_{i+l+1}} \le \beta \cdot L_{\hat{t}}(y) + (l+1) \cdot Min'_{t_{i+l+1}}.$$

After time $\hat{t}$, no load-balancing steps or deletions occur on $y$. Then, after $\hat{t}$, $y$'s load never decreases.
Since the $(i+l+1)$-th min-transfer event is a right-transfer event, $y$ must become the minimum-loaded
node at some point. Let $t^*$ be the time that $y$ becomes the minimum-loaded node. It follows that
$L_{\hat{t}}(y) \le L_{t^*}(y)$. Then, we have $L_{t_{i+l+1}}(z) \le (\beta+l+1) \cdot Min_{t_{i+l+1}}$. When $\alpha \ge \frac{3+\sqrt{33}}{2}$, it follows that
$\alpha \ge \beta$ and thus, the lemma holds in this sub case.

Sub case 2.3: Node $y$ calls SplitMax steps at time $\hat{t}$. After $y$ calls these steps, it transfers its entire
load to its neighbor $w$. Then, we have $L_{\hat{t}}(w) \ge 2Min_{\hat{t}}$. After that, $y$ is relocated. Thus, we consider
node $w$ instead.

We show that $w$ is in $P$. Assume for contradiction that $w$ is not in $P$. Note that the position of $w$
can be to the left or to the right of $y$. Consider the case that $w$ is to the right of $y$. Then $w$ can be outside $P$ only when $y$ is the
rightmost node in $P$. Because is the rightmost node in P, this case contradicts. Consider the case
that $w$ is to the left of $y$. In this case, $y$ must be the leftmost node in $P$ and $y$ transfers its load to a node
outside $P$. Note that the node to which $y$ transfers its load is $z$. This contradicts the assumption that no events occur on
$z$ except min-transfer events.

Consider the case that some deletion occurs on $w$ after $\hat{t}$. This case can be proved in the same way
as sub case 1.2.

Now, we assume that no deletion occurs on $w$ after $\hat{t}$. Recall that at time $\hat{t}$, we have $2Min_{\hat{t}} \le L_{\hat{t}}(w)$.
Since the $(i+l+1)$-th min-transfer event is a right-transfer event, $w$ becomes the minimum-loaded
node at some point after $\hat{t}$. Let $t^*$ be the time that $w$ becomes the minimum-loaded node after $\hat{t}$. After
time $\hat{t}$, load-balancing steps and deletions do not occur on $w$. Then, $w$'s load does not decrease. We
have $L_{\hat{t}}(w) \le L_{t^*}(w)$. Moreover, we have $L_{t^*}(w) \le Min'_{t_{i+l+1}}$ from the non-decreasing property of the
minimum load. Thus, $2Min_{\hat{t}} \le Min'_{t_{i+l+1}}$.

Consider Eq. (8). We divide it by $Min_{t_{i+l+1}}$, i.e.,

$$\frac{L_{t_{i+l+1}}(z)}{Min_{t_{i+l+1}}} \le \frac{L_{t_i}(z) + (l+1) \cdot Min'_{t_{i+l+1}}}{Min_{t_{i+l+1}}} \le \frac{L_{t_i}(z) + (l+1) \cdot Min'_{t_{i+l+1}}}{Min'_{t_{i+l+1}}} = (l+1) + \frac{L_{t_i}(z)}{Min'_{t_{i+l+1}}}.$$

From the invariant, it follows that $L_{t_i}(z) \le (\alpha+2) \cdot Min_{t_i} \le (\alpha+2) \cdot Min_{\hat{t}}$ when $t_i \le \hat{t}$. Recall that
$2Min_{\hat{t}} \le Min'_{t_{i+l+1}}$. Then,

$$\frac{L_{t_{i+l+1}}(z)}{Min_{t_{i+l+1}}} \le (l+1) + \frac{(\alpha+2) \cdot Min_{\hat{t}}}{2Min_{\hat{t}}}.$$

When $\alpha \ge 2$, it follows that $\frac{\alpha+2}{2} \le \alpha$. Then, we have $\frac{L_{t_{i+l+1}}(z)}{Min_{t_{i+l+1}}} \le (\alpha+l+1)$. Thus, the lemma
holds in this sub case. □

From Lemmas 8, 9 and 10, we conclude the effect of the min-transfer event on any node $z$. We omit
the proof because it is similar to the proof of Lemma 5.

Lemma 11 Consider any phase $d$. Suppose $\alpha \ge \frac{3+\sqrt{33}}{2} \approx 4.373$ and $\beta \ge \frac{3(\alpha+2)}{\alpha}$. After the $i$-th min-transfer
event occurs on $z$, $L_{t_i}(z) \le (\alpha+2) \cdot Min_{t_i}$.

Finally, we are ready to prove the imbalance ratio guarantee. In each phase, the load of any node
can be changed by insertion, deletion, nbr-transfer, nbr-share and min-transfer events. From Lemmas 2,
6, 7 and 11, we can conclude the upper bound on the load of any node in any phase after these events.

Theorem 3 Consider any phase. Suppose $\alpha \ge \frac{3+\sqrt{33}}{2} \approx 4.373$ and $\beta \ge \frac{3(\alpha+2)}{\alpha}$. For any node $u \in V$,
after any event at time $t$, $L_t(u) \le (\alpha+2) \cdot Min_t$. Thus, the imbalance ratio of the algorithm in the
general case is a constant.
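The constants in Theorem 3 can be reproduced with a short computation. The sketch below assumes that the threshold on α is (3+√33)/2, i.e., the positive root of α² − 3α − 6 = 0, which is the point where α equals 3(α+2)/α; this reading is an inference from the stated value ≈ 4.373 and from the requirement α ≥ β used in the proof of Lemma 10. The function name is hypothetical.

```python
import math

def general_case_parameters():
    """Return (alpha, beta, ratio_bound) at the smallest admissible alpha.

    Assumption: the threshold on alpha is the positive root of
    alpha^2 - 3*alpha - 6 = 0, and beta is set to 3*(alpha + 2)/alpha.
    """
    alpha = (3 + math.sqrt(33)) / 2
    beta = 3 * (alpha + 2) / alpha
    return alpha, beta, alpha + 2

alpha, beta, ratio_bound = general_case_parameters()
print(f"alpha = {alpha:.4f}, beta = {beta:.4f}, alpha + 2 = {ratio_bound:.4f}")
```

Under this assumption β coincides with α at the threshold, and α + 2 is approximately 6.3723, in line with the ratio of 6.373 quoted in the abstract once rounded up.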



5.3 Cost of the algorithm in the general case 

We analyze the cost of our algorithm in the general case. Recall that the cost of moving a key from
node $u$ to another node $v$ is counted as a unit cost. In our algorithm, there are four operations, i.e.,
insertion, deletion, MinBalance and Split. Again, we prove the amortized cost by the potential method
and use the same potential function as Ganesan et al.

Theorem 4 The amortized costs of our algorithm are constant.

Proof: We use the potential function in [4], i.e., $\Phi = \frac{c \sum_{i=1}^{n} L_t(N_i)^2}{\bar{L}_t}$, where $\bar{L}_t$ is the average load
at time $t$ and $N_i$ is the $i$-th node, which manages a range $[R_{i-1}, R_i)$. We show that the gain in potential
when an insertion or deletion occurs is at most a constant and that the drop in potential when a MinBalance
or Split operation occurs pays for the cost of the operation. These imply that the amortized costs of
insertion and deletion are constant.

We prove the cost of the insertion and MinBalance operations in the same way as in Theorem 2. We are
left to consider the deletion and Split operations.

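Deletion: When a deletion occurs on $N_j$ at time $t$, the gain in potential is at most

$$\Delta\Phi \le \frac{c\left(\sum_{i=1}^{n} L_t(N_i)^2\right) \cdot \frac{1}{n}}{\left(\bar{L}_t - \frac{1}{n}\right)\bar{L}_t}.$$

We know that $\bar{L}_t \ge Min_t$. From the invariant, we have that $L_t(u) \le (\alpha+2) \cdot Min_t \le (\alpha+2) \cdot \bar{L}_t$ for
any node $u$. Using the fact that $\bar{L}_t - \frac{1}{n} \ge \frac{\bar{L}_t}{2}$ where $n \ge 2$ and $\bar{L}_t \ge 1$, we have

$$\Delta\Phi \le \frac{c \cdot n\,(\alpha+2)^2\,\bar{L}_t^2 \cdot \frac{1}{n}}{\frac{\bar{L}_t}{2} \cdot \bar{L}_t} = 2c\,(\alpha+2)^2.$$

Hence, the amortized cost of a deletion is a constant, because the actual cost of a deletion is a unit cost.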

Split: Consider the load of node $N_j$, which calls Split at time $t$. When its load is less than $\frac{Max_t}{\beta}$,
load-balancing steps are called. There are two cases, i.e., SplitNbr and SplitMax steps.

Case SplitNbr: Let $N_k$ be the lightly-loaded neighbor of $N_j$ when $N_j$ calls SplitNbr steps. Node
$N_k$ moves its keys to $N_j$ to balance their loads. The drop in potential is

$$\Delta\Phi = \frac{c\left(L_t(N_j)^2 + L_t(N_k)^2 - 2\left(\frac{L_t(N_j) + L_t(N_k)}{2}\right)^2\right)}{\bar{L}_t} = \frac{c\left(\frac{L_t(N_j)^2}{2} + \frac{L_t(N_k)^2}{2} - L_t(N_j)L_t(N_k)\right)}{\bar{L}_t}$$

$$= \frac{c\,(L_t(N_k) - L_t(N_j))^2}{2\bar{L}_t} = \frac{c\,(L_t(N_k) - L_t(N_j))\,(L_t(N_k) - L_t(N_j))}{2\bar{L}_t}.$$

This case occurs when $L_t(N_k) > \frac{2Max_t}{\beta}$ and $L_t(N_j) < \frac{Max_t}{\beta}$. Then,

$$\Delta\Phi \ge \frac{c\,(L_t(N_k) - L_t(N_j))\left(\frac{2Max_t}{\beta} - \frac{Max_t}{\beta}\right)}{2\bar{L}_t} = \frac{c\,(L_t(N_k) - L_t(N_j))\left(\frac{Max_t}{\beta}\right)}{2\bar{L}_t} \ge c\,\frac{(L_t(N_k) - L_t(N_j))}{2}\left(\frac{1}{\beta}\right).$$

The number of moved keys when SplitNbr steps are invoked is $\frac{L_t(N_k) - L_t(N_j)}{2}$. For any $c \ge \beta$, we
have $\Delta\Phi \ge \frac{L_t(N_k) - L_t(N_j)}{2}$. Thus, the drop in potential pays for the data movement.
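The SplitNbr accounting can be illustrated on a concrete instance. This is a sketch under the assumption that the drop in potential equals $c$ times the decrease in the sum of squared loads, divided by the average load, and that the two nodes end with exactly equal loads (rounding ignored); the function name and the sample values are hypothetical.

```python
from fractions import Fraction

def splitnbr_drop(lj, lk, avg, c):
    """Potential released when N_j and N_k average their loads (rounding ignored)."""
    half = Fraction(lj + lk, 2)
    return c * (Fraction(lj) ** 2 + Fraction(lk) ** 2 - 2 * half ** 2) / avg

# Hypothetical instance: loads [30, 2, 24, 30], beta = 3, N_j has load 2, N_k has load 24.
lj, lk, beta = 2, 24, 3
avg = Fraction(30 + 2 + 24 + 30, 4)
drop = splitnbr_drop(lj, lk, avg, c=beta)

assert drop == beta * Fraction(lk - lj) ** 2 / (2 * avg)   # closed form c (L_k - L_j)^2 / (2 avg)
assert drop >= Fraction(lk - lj, 2)                        # covers the number of moved keys
```

The instance satisfies the SplitNbr preconditions for β = 3, since 24 exceeds 2·30/3 and 2 is below 30/3.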

Case SplitMax: Node $N_j$ transfers its entire load to its adjacent node $N_k$ and then takes half the
load of the maximum-loaded node $N_l$. The drop in potential is

$$\Delta\Phi = \frac{c\left(L_t(N_j)^2 + L_t(N_k)^2 + L_t(N_l)^2 - 2\left(\frac{L_t(N_l)}{2}\right)^2 - (L_t(N_k) + L_t(N_j))^2\right)}{\bar{L}_t} = \frac{c\left(\frac{L_t(N_l)^2}{2} - 2L_t(N_k)L_t(N_j)\right)}{\bar{L}_t} \ge \frac{c\left(\frac{L_t(N_l)^2}{2} - 2L_t(N_k)L_t(N_j)\right)}{L_t(N_l)}.$$

From the algorithm, we know that $L_t(N_k) \le \frac{2Max_t}{\beta}$ and $L_t(N_j) < \frac{Max_t}{\beta}$. Then,

$$\Delta\Phi \ge c\,L_t(N_l)\left(\frac{1}{2} - \frac{2 \cdot \frac{2Max_t}{\beta} \cdot \frac{Max_t}{\beta}}{L_t(N_l)^2}\right) \ge c\,L_t(N_l)\left(\frac{1}{2} - \frac{4}{\beta^2}\right).$$

When SplitMax steps are invoked, the number of moved keys is $\frac{L_t(N_l)}{2} + L_t(N_j) < L_t(N_l)$. For any
$c \ge \frac{2\beta^2}{\beta^2 - 8}$, it follows that $\Delta\Phi \ge L_t(N_l)$ and thus, the drop in potential pays for the key movement.
Thus, the amortized costs of the algorithm are constant. □
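The SplitMax accounting can be illustrated in the same way. The sketch below assumes that the lower bound on the drop is $c \cdot L_t(N_l)\,(1/2 - 4/\beta^2)$, so that the smallest admissible constant is $2\beta^2/(\beta^2 - 8)$, which requires $\beta^2 > 8$; the function names and the sample loads are hypothetical.

```python
from fractions import Fraction

def splitmax_min_c(beta):
    """Smallest c with c * (1/2 - 4/beta^2) >= 1; requires beta^2 > 8."""
    beta = Fraction(beta)
    return 2 * beta ** 2 / (beta ** 2 - 8)

def splitmax_drop(lj, lk, ll, avg, c):
    """Potential released when N_j empties into N_k and then takes half of N_l."""
    before = lj ** 2 + lk ** 2 + ll ** 2
    after = 2 * Fraction(ll, 2) ** 2 + (lk + lj) ** 2
    return c * (before - after) / avg

# Hypothetical instance: loads [10, 2, 30, 3], beta = 3, N_j = 2, N_k = 10, N_l = 30.
beta = 3
c = splitmax_min_c(beta)
lj, lk, ll = 2, 10, 30
avg = Fraction(10 + 2 + 30 + 3, 4)
assert splitmax_drop(lj, lk, ll, avg, c) >= ll   # drop covers the moved keys, which number < L(N_l)
```

For β = 3 the constant evaluates to 18, and it decreases towards 2 as β grows.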



6 The cost in real networks 

In this section, we discuss how Ganesan et al. [4] dealt with the global information and, again, discuss the
comparison between this line of work, which this paper extends, and the work of Karger and Ruhl [9, 10]
when considering real networks.

In peer-to-peer networks, there is no centralized server to provide the information. If a node
wants information about another node, it must send messages to that node. Besides the data movement
cost, there is another cost to be considered: the communication cost. The communication cost is defined
as the number of messages required to complete the operation.

We now discuss how Ganesan et al. implement the idea on real networks. Their implementation
is based on skip graphs [1]. Skip graphs support the find operation, node insertion, and node deletion
with O(log n) messages with high probability. Also, adjacent nodes can be contacted with O(1) messages.
Ganesan et al. use two skip graphs: one where nodes are ordered by the minimum key in their ranges, and
another where nodes are ordered by their loads. Therefore, global information can be found with O(log n)
messages, and each partition change costs at most O(log n) messages. We note that while this cost is
more than the constant cost of data movement, O(log n) messages are usually required for finding the
node for each key, and thus this cost can be amortized against the searching cost.

As in the work of Ganesan et al., Karger and Ruhl [9, 10] simplify the cost model by not considering how
one could find a given node in the system. Therefore, unless they maintain a global directory of nodes,
using known data structures for p2p systems, they still need O(log n) messages as well.